complete workflow to check sparql queries #396

DeleMike · 2024-10-16T23:18:48Z

Contributor checklist

This pull request is on a separate branch and not the main branch

Description

Adds steps in workflow file to run check_query_identifiers.py

Related issue

fixes Add workflow to check queries #339

github-actions · 2024-10-16T23:19:15Z

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. Also consider joining our bi-weekly Saturday dev syncs. It'd be great to have you!

Maintainer checklist

The linting and formatting workflow within the PR checks do not indicate new errors in the files changed
The CHANGELOG has been updated with a description of the changes for the upcoming release and the corresponding issue (if necessary)

DeleMike · 2024-10-17T05:01:23Z

@andrewtavis, I've finally got the workflow file working! (omg!!)

But what should the expected behaviour be? For now, it only logs the files that have QID issues to the console; the workflow does not actually fail, so all checks stay green.

How do we wanna approach this? Should the workflow fail? Is there a way we could notify the PR author that something is wrong?

andrewtavis · 2024-10-17T06:27:39Z

The workflow should fail :) A big thing is that it's also a tool for maintainers to know if things are wrong in a PR, and a failed workflow says that loud and clear. We were discussing warnings for GitHub Actions in another issue, which apparently are a thing, but let's keep it simple: if there's a problem, fail the workflow.

DeleMike · 2024-10-17T07:35:06Z

Okay, I'll figure this out. Thanks for the update @andrewtavis!

DeleMike · 2024-10-17T14:07:45Z

@andrewtavis I have updated the codebase so that the workflow fails when invalid queries are found.

With this:

# Exit with an error code if any incorrect QIDs are found
if incorrect_languages or incorrect_data_types:
    sys.exit(1)

I believe with this, all checks are complete
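
For readers following along, here is a minimal sketch of how a check script can turn its findings into a failing CI step. GitHub Actions marks a step as failed when its command exits with a non-zero status, which is what `sys.exit(1)` provides. The helper names `exit_code_for` and `report_and_exit` and the data type message text are hypothetical, not taken from `check_query_identifiers.py`.

```python
import sys


def exit_code_for(incorrect_languages, incorrect_data_types):
    """Return 1 when any file has an invalid QID, otherwise 0."""
    return 1 if (incorrect_languages or incorrect_data_types) else 0


def report_and_exit(incorrect_languages, incorrect_data_types):
    """Print the offending files, then exit with a status the CI runner reads."""
    if incorrect_languages:
        print("Incorrect Language QIDs found in the following files:")
        for path in incorrect_languages:
            print(f"- {path}")

    if incorrect_data_types:
        # Hypothetical message text, mirroring the language message above.
        print("Incorrect Data Type QIDs found in the following files:")
        for path in incorrect_data_types:
            print(f"- {path}")

    # A non-zero exit status makes the workflow step (and the check) fail.
    sys.exit(exit_code_for(incorrect_languages, incorrect_data_types))
```

Keeping the exit-code decision in its own small function makes it testable without terminating the test process.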

andrewtavis · 2024-10-18T01:14:40Z

Quick note being sent to all the testing PRs, if updates are needed now that #402 has been merged, then it'd be great to get those updates to the branch :) If no updates are needed, then let me know 😊

DeleMike · 2024-10-18T08:04:49Z

> Quick note being sent to all the testing PRs, if updates are needed now that #402 has been merged, then it'd be great to get those updates to the branch :) If no updates are needed, then let me know 😊

checking...

…ON structure
- Updated the logic for building language_map and language_to_qid to handle languages with sub-languages.
- Both main languages and sub-languages are now processed in a single pass, ensuring that:
  - language_map includes all metadata for main and sub-languages.
  - language_to_qid correctly maps both main and sub-languages to their QIDs.
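
The single-pass handling described above could look roughly like the sketch below. It assumes a metadata layout where a main language entry holds its own `qid`, a nested `sub_languages` dictionary, or both; that layout, the `build_language_maps` name, and the sample keys and QIDs are illustrative assumptions, not the project's actual code or data.

```python
def build_language_maps(metadata):
    """Build language_map and language_to_qid in one pass over the metadata.

    Assumed shape (hypothetical):
        {"alpha": {"iso": "aa", "qid": "Q1"},
         "beta": {"sub_languages": {"beta_one": {"iso": "b1", "qid": "Q2"}}}}
    """
    language_map = {}
    language_to_qid = {}

    for name, entry in metadata.items():
        # Main languages are always recorded in language_map.
        language_map[name] = entry
        if "qid" in entry:
            language_to_qid[name] = entry["qid"]

        # Sub-languages are handled in the same iteration as their parent.
        for sub_name, sub_entry in entry.get("sub_languages", {}).items():
            language_map[sub_name] = sub_entry
            language_to_qid[sub_name] = sub_entry["qid"]

    return language_map, language_to_qid
```

A main language that only groups sub-languages gets no QID entry of its own in this sketch; whether that matches the real behaviour is not confirmed by the thread.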

DeleMike · 2024-10-18T11:43:25Z

Hi @andrewtavis, I have adjusted this PR to accommodate the changes from #402 and, as you can see, one workflow is failing: the one to Check Query Identifiers.

For now, these are the files causing the issue (you can check the workflow message):

Incorrect Language QIDs found in the following files:
- /home/runner/work/Scribe-Data/Scribe-Data/src/scribe_data/language_data_extraction/Igbo/verbs/query_verbs.sparql
- /home/runner/work/Scribe-Data/Scribe-Data/src/scribe_data/language_data_extraction/Korean/verbs/query_verbs.sparql
- /home/runner/work/Scribe-Data/Scribe-Data/src/scribe_data/language_data_extraction/Korean/postpositions/query_postpositions.sparql
- /home/runner/work/Scribe-Data/Scribe-Data/src/scribe_data/language_data_extraction/Korean/adverbs/query_adverbs.sparql

They are failing because these languages (Igbo and Korean) do not exist in language_metadata.json. Should I update the JSON file so that this workflow passes?
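
To illustrate why a query file gets flagged, here is a rough sketch of pulling the language QID out of a SPARQL query and comparing it with the QIDs known from the metadata. It assumes queries name their language as `dct:language wd:Q...` and their data type as `wikibase:lexicalCategory wd:Q...`; the regular expressions, function names, and the placeholder QIDs in the usage notes are hypothetical and may differ from what `check_query_identifiers.py` actually does.

```python
import re

# Assumed query shape; adjust the patterns if the queries are written differently.
LANGUAGE_QID = re.compile(r"dct:language\s+wd:(Q\d+)")
DATA_TYPE_QID = re.compile(r"wikibase:lexicalCategory\s+wd:(Q\d+)")


def extract_qid(pattern, query_text):
    """Return the first QID matched by pattern, or None if there is none."""
    match = pattern.search(query_text)
    return match.group(1) if match else None


def is_known_language_qid(query_text, known_qids):
    """True when the query's language QID is present in the known set."""
    qid = extract_qid(LANGUAGE_QID, query_text)
    return qid is not None and qid in known_qids
```

Under this sketch, a query whose language has no entry in the metadata file returns `False` and its path would be added to the list of incorrect files, which is the situation reported for the Igbo and Korean queries.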

DeleMike · 2024-10-19T11:06:23Z

Hi @andrewtavis 👋🏾
I have updated language_metadata.json to support the Korean and Igbo languages.

I also updated the test cases so that they expect these languages.

Now all test cases are passing :)

We can say that all QIDs are valid, specifically the language QIDs and data type QIDs.

andrewtavis

Really great work here, @DeleMike 😊 Appreciate your dedication to getting these tests up and running! 🚀

complete workflow to check sparql queries

ad54e29

DeleMike added 7 commits October 17, 2024 00:21

add function call to check queries

5faa2f4

update check_query_identifiers workflow file: activate virtual enviro…

c9c50d9

…nment

add working directory

1e04e4b

update workflow: fix file path

97f3243

reduce dependencies

2ee16bb

add pythonpath dependencies

92e4ad9

add workflow fix

042958e

andrewtavis added the hacktoberfest-accepted Accepted as a part of Hacktoberfest label Oct 17, 2024

andrewtavis self-requested a review October 17, 2024 07:29

Ebeleokolo and others added 7 commits October 17, 2024 14:19

Add Finnish verbs query

ac4a2ba

Updates to Finnish verbs query

ee5b034

throw error if invalid QIDs are found

3b9a61a

post comment if workflow fails

10e7a50

fix async block in workflow

1d6668b

give gh actions write access

2cdcc01

remove pr comment steps

eb0e3f2

DeleMike mentioned this pull request Oct 17, 2024

Moving from Old Language Metadata Structure to Support Sub-languages and Simplified JSON #402

Merged

1 task

GicharuElvis and others added 5 commits October 18, 2024 09:08

Added Swedish Adjectives

0a2d574

Create query_verbs.sparql

8f3425a

I noticed that there was no folder for Igbo.

Add Igbo to the languages check

5ffafb0

Remove label service from adjectives query

cac8dd6

Update query_adverbs.sparql

34d84d2

added comparative

OmarAI2003 and others added 21 commits October 18, 2024 09:08

Handle sub-languages in language table generation

bc65e0d

- Utilized already built helper functions to support sub-languages when retrieving ISO and QID values. - Updated table printing to correctly format and display both main languages and sub-languages.

adding new languages and their dialects to the language_metadata.json…

47ff4f8

… file

Modified the loop that searches languages in the list_data_types func…

f1f8928

…tion to reflect the new JSON structure, ensuring only data types are printed and no sub-languages unlike before.

Capitalize the languages returned by the function 'format_sublanguage…

5a4f721

…_name' to align with the directory structure in the language_data_extraction directory.

Implemented minor fixes by utilizing the format_sublanguage_name func…

eaf89e4

…tion to handle sub_language folders.

Updated the instance variable self.languages in ScribeDataConfig to u…

661d723

…se list_all_languages, assigning a complete list of all languages.

adding mandarin as a sub language under chinese and updating some qids

dffb9f7

Update test_list_languages to match updated output format

4a204c0

removing .capitalize method since it's already implemented inside lag…

0249c96

…uages listing functions

Updating test cases in test_list.py file to match newly added languages

a584749

Update test cases to include sub-languages

4ef0c22

- Updated all test cases to account for sub-languages. - Removed tests for est_get_language_words_to_remove and est_get_language_words_to_ignore, as these functions were deleted from utils.py and the languages metadata files

Updated the get_language_from_iso function to depend on the JSON file…

775fb24

…. Made the language_metadata parameter optional in two functions. Added a ValueError exception when a language is not found.

Add unit tests for language formatting and listing:

0b75b4e

- Positive and negative tests for format_sublanguage_name - Test to validate the output of list_all_languages

Edits to language metadata and supporting functions + pr checklist

ad61c66

removing .capitalize method since it's already implemented inside lag…

efb1f64

…uages listing functions

adjust is_valid_language function to suit new JSON structure

048c84f

Merge branch 'main' into fix/adjust-check-query-workflow

18d0747

fix failing tests and update docs

d814ecb

DeleMike added 2 commits October 19, 2024 11:51

fix failing workflow: add languages to workflow and update failing te…

c8214ff

…st cases.

fix failing tests

6517ffe

andrewtavis added 3 commits October 19, 2024 16:37

Merge branch 'main' into fix/adjust-check-query-workflow

2b9b1e1

Add Latvian to language metadata file

8586625

Add spacing and Latvian to testing

a975a6b

andrewtavis approved these changes Oct 19, 2024

andrewtavis merged commit 9d5c37c into scribe-org:main Oct 19, 2024
5 checks passed
